Training apparatus and method

ABSTRACT

Training apparatus for training a user to engage in transactions (e.g. a foreign language conversation) with another person whom the apparatus is arranged to simulate, the apparatus comprising:
an input for receiving input dialogue from a user; a lexical store containing data relating to individual words of said input dialogue; a rule store containing rules specifying grammatically allowable relationships between words of said input dialogue; a transaction store containing data relating to allowable transactions between said user and said person; a processor arranged to process the input dialogue to recognise the occurrence therein of words contained in said lexical store in the relationships specified by the rules contained in said rule store in accordance with the data specified in the transaction store, and to generate output dialogue indicating when correct input dialogue has been recognised; and an output device for making the output dialogue available to the user.

BACKGROUND OF THE INVENTION

1. Field of the invention

This invention relates to apparatus and methods for training; particularly, but not exclusively, for language training.

2. Description of Related Art

In language training, various different skills may be developed and tested. For example, our earlier application GB 2242772 discloses an automated pronunciation training system, in some respects improving upon the well known “language laboratory” automated test equipment.

Training in dialogue is carried out by human teachers who are experienced in the target language (i.e. the language to be learned). In such training, the teacher will understand what is being said, even when the grammar is imperfect, and can exercise judgment in indicating when a serious or trivial mistake is made, and in explaining what the correct form should be.

Ultimately, it may become possible to provide a computer which would duplicate the operation of such a language teacher, in properly comprehending the words of a student, carrying out a full dialogue, and indicating errors committed by the student. However, although the fields of artificial intelligence and machine understanding are advancing, they have not as yet reached this point.

EP-A-0665523 briefly discloses a foreign language skills maintenance system, in which role playing is permitted, comprising an input for receiving input dialogue from a user and an output at which the “correct” dialogue which would be anticipated from the user is displayed, for comparison with the input dialogue by the user (or by the computer).

An object of the present invention is to provide a training system (particularly for language training but possibly applicable more widely) which utilizes limited volumes of memory to store limited numbers of words and grammatical data, but is nonetheless capable of recognizing input language errors and of carrying on a dialogue with a student.

SUMMARY OF THE INVENTION

In an embodiment, the present invention provides a display of a person, and is arranged to vary the display to have different expressions, corresponding to comprehension, and at least one degree of incomprehension. Preferably, two degrees of incomprehension are provided; one corresponding to an assumed error in an otherwise comprehensible input and the other corresponding to incomprehensible input.

In an embodiment, a display is provided which indicates target language responses generated by the invention, together with text (preferably in the target language) indicating the level of comprehension achieved. Thus, an error is indicated without interrupting the target language dialogue.

Preferably, in an embodiment, the invention provides for the generation of source language text for the guidance of the student. Preferably, the source language text is normally hidden and is displayed on command by the user.

Very preferably, the source language text comprises guidance as to what the last target language output text means.

Very preferably, the guidance text comprises an explanation of what any detected error is assumed to be.

Very preferably, the guidance text comprises text indicating what suitable next responses by the student might be.

Alternatively, the invention may comprise speech recognition means for the input of speech and/or speech synthesis means for the generation of speech, to replace input and/or output text in the above embodiments.

Preferably, the invention comprises a terminal for use by the student at which input is accepted and output is generated, and a remote computer at which the processing necessary to convert each input from the user to corresponding outputs is performed, the two being linked together by a telecommunications channel. This arrangement permits the processing resources required to be centralized, rather than requiring them to be present for each user (language student). It also provides for effective use of the telecommunications channel, since much of the traffic is relatively low bandwidth text information.

Preferably, in this embodiment, the telecommunications channel comprises the network of high bandwidth links interconnecting computer sites known as the “Internet”. Where this is the case, the invention may conveniently be realized as a mobile program (“applet”) which is downloaded initially, and operates with conventional resident communications programs referred to as “HTML browsers”.

In an embodiment, the invention operates by reference to data relating to words, and data relating to grammatical rules.

This enables a far greater range of input and output dialogue, for the same memory usage, than direct recognition and/or generation of dialogue phrases.

The presence of errors may be detected by providing a first set of rules which are grammatically correct and, associated with each rule of the first set, a respective second set of rules each of which relaxes a constraint of the respective first rule to which it relates. Input text is then parsed by using rules of the first set and, at least where this is unsuccessful, rules of the second sets; where text is successfully parsed by a rule of a second set but not by the first set rule to which that second set relates, the error determined to be present is that corresponding to the constraint which was relaxed in the rule of the second set.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 2 is a block diagram showing in greater detail the structure of a user interface terminal forming part of FIG. 1;

FIG. 3 is an illustrative diagram of the display shown on a display device forming part of the terminal of FIG. 2;

FIGS. 4 a-4 d are exemplary displays shown on the display of FIG. 3;

FIG. 5 is a block diagram showing schematically the structure of a host computer forming part of FIG. 1;

FIG. 6 is a flow diagram showing schematically the general process performed by the user interface terminal of FIG. 2;

FIG. 7 illustrates the structure of a control message transmitted from the host computer of FIG. 5 to the user interface terminal of FIG. 2;

FIG. 8 is a diagram showing schematically the contents of a store forming part of the host computer of FIG. 5;

FIG. 9 (comprising FIGS. 9 a-9 f) is a flow diagram showing schematically the process of operation of the host computer of FIG. 5.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIG. 1, the system of a first embodiment of the invention comprises a terminal 10 such as a personal computer connected, via a telecommunications link 12 such as a telephone line, to a telecommunications network 14 such as the Internet, which in turn is connected to a host computer 20. Both the terminal 10 and the host computer 20 are conveniently arranged to communicate in a common file transfer protocol such as TCP/IP.

Referring to FIG. 2, the terminal 10 comprises a central processing unit 102, a keyboard 104, a modem 106 for communication with the telecommunications link 12, a display device 108 such as a CRT, and a store 110, schematically indicated as a single unit but comprising read only memory, random access memory, and mass storage such as a hard disk. These are interconnected via a bus structure 112.

Within the store 110 is a frame buffer area, to which pixels of the display device 108 are memory mapped. The contents of the frame buffer comprise a number of different window areas when displayed on the display device 108, as shown in FIG. 3; namely, an area 302 defining an input text window; an area 304 carrying a visual representation of a person; an area 306 defining an output text window; an area 308 defining a comprehension text window; an area 310 displaying a list of possible items; an area 312 defining a transaction result window; and an area 314 defining a user guidance window. The CPU 102 is arranged selectively to hide the response guidance window 314, and to display an icon 315, the response guidance window being displayed only when the icon 315 is selected via the keyboard or other input device.

FIG. 4 a illustrates the appearance of the display device 108 in use; the response guidance display area 314 is hidden, and icon 315 is displayed.

Also stored within the store 110 are a set of item image data files, represented in a standardized format such as for example a .GIF or .PIC format, each being sized to be displayed within the transaction result area 312, and a set of expression image data files defining different expressions of the character displayed in the person area 304. Finally, data defining a background image is also stored.

Referring to FIG. 5, the host computer 20 comprises a communications port 202 connected (e.g. via an ISDN link) to the internet 12; a central processing unit 204; and a store 206. Typically, the host computer 20 is a mainframe computer, and the store comprises a large scale off line storage system (such as a RAID disk system) and random access memory.

Control and Communications

The terminal 10 and host computer 20 may operate under conventional control and communications programs. In particular, in this embodiment the terminal 10 may operate under the control of a GUI such as Windows (TM) and a Worldwide Web browser such as Netscape (TM) Navigator (TM) which is capable of receiving and running programs (“Applets”) received from the Internet 12. The host computer 20 may operate under the control of an operating system such as Unix (TM) running a Worldwide Web server program (e.g. httpd). In view of the wide availability of such operating programs, further details are unnecessary here.

General Overview of System Behavior

In this embodiment, the scenario used to assist in language training is that of the grocer's shop selling a variety of foods.

The object of the present embodiment is to address input text in the target language to the grocer. If the text can be understood as an instruction to supply a type of item, this will be confirmed with visual feedback of several types; firstly, a positive expression will be displayed on the face of the grocer (area 304); secondly, the requested item will appear in the grocery basket transaction area (area 312) displayed on the screen 108; and thirdly the instruction will be confirmed by output text in the target language from the grocer (area 306).

If the input text can be understood as an instruction to purchase an item, but contains recognized spelling or grammatical errors, visual feedback of the transaction is given in the form of a confirmation of what the understood transaction should be as output text, and the display of the item in the grocery basket (area 312).

However, the existence of the error is indicated by the selection of a negative displayed expression on the face of the grocer (area 304), and a general indication as to the nature of the error is given by displaying text in the target language in a window indicating the grocer's thoughts (area 308).

This may be sufficient, taken with the user's own knowledge, to indicate to the user what the error is; if not, the user may select further assistance, in which case user guidance text indicating in more detail, in the source language, what the error is thought to be is displayed.

If the input text cannot be understood because one or more words (after spell correction) cannot be recognized, a negative expression is displayed in the face of the grocer (area 304) and output text in the target language is generated in the area 306 to question the unrecognized words.

If the words in the input text were all recognized but the text itself cannot be recognized for some other reason, then a negative expression is generated on the face of the grocer (304) and output text in the target language is generated in area 306 recording a failure to understand.

In such cases of complete lack of comprehension, a facial expression differing from the partial incomprehension shown in FIG. 4 c is selected for display.

Operation of Terminal 10

Referring to FIG. 6, to initiate use of the system, the user sets up a connection to the host computer 20 from the terminal 10 (step 402). In step 404, a program (applet) for controlling the display of the image data is downloaded.

The host computer 20 then downloads a file of data representing the background image, a plurality of files of data representing the different possible expressions of the grocer, and a plurality of files of data representing all the items on sale, in step 406.

In step 408, initial control data is received from the computer 20, in the form of a control data message 500 which, as shown in FIG. 7, comprises a target language output text string 506, corresponding to words to be spoken by the grocer and hence to be displayed in the display area 306; a source language user guidance text string 514 to be displayed in the user guidance display area 314 if this is selected for display by the user; one or more item symbols 512 which will cause the selection for display of the images of one or more items in the display area 312; an expression symbol 504 for selecting one of the downloaded expression image files for display on the face of the grocer in the display area 304; and a target language comprehension text string 508 for display in the display area 308 to indicate what the grocer would understand by target language text input by a user as described below.

In the initial message transmitted in step 408, the item symbol field 512 and comprehension text field 508 are both empty.
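By way of illustration only, the fields of the control message 500 might be held in memory as sketched below. The class name, attribute names, the "neutral" expression symbol and the guidance wording are assumptions made for this sketch; the specification defines only the numbered fields 504 to 514 and does not prescribe any encoding.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ControlMessage:
    """Hypothetical in-memory form of control message 500 (FIG. 7)."""
    expression_symbol: str            # field 504: selects an expression image for area 304
    output_text: str                  # field 506: target language text for area 306
    comprehension_text: str = ""      # field 508: target language text for area 308
    item_symbols: List[str] = field(default_factory=list)  # field 512: items for area 312
    guidance_text: str = ""           # field 514: source language text for area 314


def initial_message() -> ControlMessage:
    """Build an opening message; fields 508 and 512 are left empty."""
    return ControlMessage(
        expression_symbol="neutral",  # assumed symbol name
        output_text="Vous désirez?",
        guidance_text="The grocer is asking what you would like.",  # assumed wording
    )
```

A terminal receiving such a message would redraw areas 304 and 306 unconditionally, and areas 308, 312 and 314 only where the corresponding fields are non-empty.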

In step 410, the CPU 102, under control of the program downloaded in step 404, first loads the background image to the frame store within the storage unit 110, and then overwrites the areas 304, 306 and, where applicable, 312 and 314: by generating image data representing the text strings and inserting it in the relevant windows 306, 308, 314; by selecting the facial expression image indicated by the expression symbol 504 and displaying this in the upper area of the person display area 304; and by selecting each item image indicated by an item symbol and displaying it in the area 312.

With the exception of the window 302 (which would at this stage be empty), the appearance of the display unit 108 at this stage is as shown in FIG. 4 a.

Thus, the background display consists of the display of all the item images in the display area 310 together with a corresponding text label indicating, in each case, the item name; the display of the icon 315 indicating tutorial assistance; the display of the figure of a grocer with one of the selected expressions; the display of a speech bubble containing the grocer's speech output 306; and the display of a basket 312 receiving items placed therein by the grocer in response to shopping instructions.

If, in step 412, an instruction to log off or exit is input by the user, the process terminates. Otherwise, the CPU 102 scans the keyboard 104 (step 414) for the input of a string of text terminated by a carriage return or other suitable character, which is displayed in the input text display area 302 and, when input is complete, transmitted to the computer 20 in step 416 via the modem and Internet 12.

In response to the transmission of input text in step 416, the computer 20 returns another control message 500 (received in step 418) and, in response thereto, the terminal 10 returns to step 410 to update the display to reflect the contents of the control message.

Thus, referring to FIG. 4 b, the result of the input of the text string shown in area 302 of FIG. 4 a is to cause the display of the text message “Voila un kilo de pommes! Et avec ca?” in the output text area 306 (this representing the contents of the field 506 of the received control message).

Field 504 contains a symbol corresponding to a cheerful or positive expression, and the corresponding bit map image is displayed in the upper portion of field 304.

Field 512 contains a symbol indicating the appearance of an apple and accordingly this symbol is displayed in display area 312. No data is contained in the comprehension text field 508. Data is contained in the user guidance text field 514 but not displayed since the user has not selected the icon 315.

If, at this stage, the text input in step 414 is as displayed in the field 302 of FIG. 4 b (which contains the words “Trois cents grammes de beure”), the control data received in step 418 leads to the display indicated in FIG. 4 c.

In this case, the target language text indicated in the field 306 (“Voila trois cents grammes de beurre! Et avec ca?”) indicates what the correct word is presumed to be, but the comprehension text field 508 of the received control message contains the target language text, displayed in field 308, “Erreur d'orthographe!” in a “thinks bubble” representation to indicate the thoughts of the grocer.

The expression symbol field 504 contains a symbol causing the display of a puzzled expression on the face of the grocer as shown in field 304. Since the transaction has been understood, the item (butter) is represented by a symbol in the item symbol field 512 and displayed in the area 312.

If, at this stage, the user selects the icon 315 (e.g. by a combination of key strokes or by the use of a pointing device such as a mouse) the contents of the user guidance (source language) text field 514 are displayed in the display area 314 which is overlaid over the background display as shown in FIG. 4 d. In this embodiment, the guidance text contains three text fields; a first field 314 a indicating generally, in the source language (e.g. English), what the words in the field 306 mean; an error analysis display 314 b indicating, in the source language (e.g. English), the meaning of the words in the comprehension text field 308 and indicating what, in this case, the spelling error is assumed to be; and an option field 314 c containing text listing the options for user input in response to the situation.

From the foregoing, the operation of the terminal 10 will therefore be understood to consist of uploading input text to the computer 20; and downloading and acting upon control messages in response thereto from the computer 20.

Action of the Host Computer 20

The host computer 20 will be understood to be performing the following functions:

1. Scanning the input text to determine whether it relates to one of the transactions (e.g., in this case, sale of one of a number of different items) in a predetermined stored list.

2. Determining whether all the information necessary for that transaction is complete. If so, causing the returned control message to display visual indications that this is the case. If not, causing the returned control message to include output text corresponding to a target language question designed to elucidate the missing information.

3. Spell checking and parsing the input text for apparent errors of spelling or grammar, and causing the returned control message to include the indicated errors.

4. Generating the user guidance text indicating, in the source language, useful information about the target language dialogue.

Because the number of transactions to be detected is relatively small, the computer 20 does not need to “understand” a large number of possible different input text strings or their meanings; provided the input text can be reliably associated with one of the expected transactions, it is necessary only to confirm whether all input words are correctly spelled and conform to an acceptable word order, without needing to know in detail the nuances of meaning that input text may contain.

However, the use of a set of grammar rules and a vocabulary database in the embodiment, as discussed in greater detail below, enables the computer 20 to comprehend a much wider range of input texts than prior art tutoring systems which are arranged to recognize predetermined phrases.

Referring to FIG. 8, the store 206 contains the following data:

a lexical database 208 comprising a plurality of word records 208 a, 208 b . . . 208 n each comprising:

- the word itself, in the target language;
- the syntactic category of the word (e.g. whether it is a noun, a pronoun, a verb etc);
- the values for a number of standard features of the word (specifically, the gender of the word, for example);
- information (a symbol) relating to the meaning of the word; for example, where the word is a noun or verb, the symbol may be its translation in the source language or, where the word is another part of speech such as an article, data indicating whether it is the definite or indefinite article and whether it is singular or plural.

Also comprised within the store 206 is a rule database 210 comprising a plurality (e.g. 44 in this embodiment) of rules 210 a, 210 b . . . 210 n each specifying a rule of syntax structure of the target language and associated with a particular syntactic category. For example, the rule for a noun phrase will specify that it must comprise a noun and the associated article, whereas the rule for a verb phrase specifies that it must include a verb and its associated complement(s), and may include a subject, with which the form of the verb must agree, and which may (together with the object of the verb) be one of several different syntactic categories (e.g. a noun, a noun phrase, a pronoun and so on).

In general, rules will specify which types of words (or clauses) must be present in which order, and with what agreements of form, for a given semantic structure (e.g. a question).

In many target languages (for example French) agreement between the form of words is necessary. Thus, where a noun or a pronoun has an associated gender, then other parts of speech such as the definite or indefinite article, or the verb, associated with that noun or pronoun must have the same gender.

Likewise, where a noun or pronoun is associated with a number (indicating whether it is singular or plural) then the associated definite or indefinite article and/or verb must be singular or plural in agreement.

Other types of agreement may also be necessary, for example, to ensure that a word is in the correct case or tense. The need for such agreements is recorded in the relevant rules in the rules database.

A suitable semantic representation for the rules and words stored for use in the above embodiments may be found in “Translation using minimal recursion semantics” by A. Copestake, D. Flickinger, R. Malouf, S. Riehemann, and I. Sag, to appear in the proceedings of the 6th International Conference on Theoretical and Methodological Issues in Machine Translation (Leuven), currently available via the Internet at http://hpsg.stanford.edu/hpsg/papers.html.

In order to detect simple errors, in this embodiment the rules stored in the rules database 210 comprise, for at least some of the rules, a first rule which specifies those agreements (for example of gender and number) which are grammatically necessary for the corresponding syntactic structure to be correct, but also a plurality of relaxed versions of the same rule, in each of which one or more of the agreement constraints is relaxed.

In other words, for a first rule 210 a which specifies correct agreement of both gender and number, there are associated relaxed rules 210 b and 210 c, the first of which (210 b) corresponds but lacks the requirement for agreement of gender, and the second of which (210 c) corresponds but lacks the requirement for agreement of number.

Conveniently, the relaxed rules are stored following the correct rules with which they are associated.

Rather than permanently storing all inflections of each word in separate word records 208 or storing all versions of the same word within its word record 208, conveniently an inflection table 212 is provided consisting of a plurality of inflection records, each consisting of a word stem and, for each of a predetermined plurality of different inflecting circumstances (such as cases, tenses and so on), the changes to the word endings of the stem.

Because many words exhibit identical inflection behaviour, the number of records 212 a, 212 b in the inflection table 212 is significantly smaller than the number of lexical records 208 a . . . 208 n in the lexical database 208. Each record in the lexical database 208 contains a pointer to one of the records in the inflection table 212, and the relationship is usually many to one (that is, several words reference the same inflection model record in the inflection table 212).

Before each use, or period of use, of the host computer 20 the CPU 204 reads the lexical records 208, and expands the lexical records table 208 to include a new record for each inflected version of the word, using the inflection table 212.

After operation of the present invention ceases, the CPU 204 correspondingly deletes all such additional entries. Thus, in periods when the invention is not in use, memory capacity within the computer 20 is conserved.

Prior to expansion, the lexical table 208 in this embodiment contains 265 records.
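The expansion of the lexical table from a shared inflection table may be sketched as follows. This is a simplified illustration, not the stored format of the embodiment: the record layouts, the two sample words and the single inflection model (which merely appends an ending for each circumstance) are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class InflectionModel:
    # Maps an inflecting circumstance to the ending appended to the stored form.
    endings: Dict[str, str]


@dataclass
class LexicalRecord:
    word: str        # stored form, in the target language
    category: str    # syntactic category
    gender: str      # one of the standard features
    meaning: str     # symbol relating to the meaning
    model: int       # pointer (index) into the inflection table


# Several words share one inflection model: a many-to-one relationship.
INFLECTION_TABLE: List[InflectionModel] = [
    InflectionModel({"singular": "", "plural": "s"}),
]

LEXICON: List[LexicalRecord] = [
    LexicalRecord("pomme", "noun", "f", "apple", 0),
    LexicalRecord("orange", "noun", "f", "orange", 0),
]


def expand(lexicon: List[LexicalRecord],
           table: List[InflectionModel]) -> List[Tuple[str, str, str, str, str]]:
    """Return one (form, category, gender, circumstance, meaning) entry per inflection."""
    expanded = []
    for rec in lexicon:
        for circumstance, ending in table[rec.model].endings.items():
            expanded.append(
                (rec.word + ending, rec.category, rec.gender, circumstance, rec.meaning)
            )
    return expanded
```

Because the expanded entries are derived data, they can be discarded when the system is idle and regenerated before the next period of use, which is the memory saving described above.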

Specific information about the transactions making up the grocer shop scenario is stored in a transaction table 214 consisting of a number of entries 214 a, 214 b . . . 214 n.

The entries include information defining the items (e.g. apples) as being goods for sale, and defining units of measurement (e.g. kilos), and relating each kind of item to the units of measure in which it is sold and the price per unit. Data is also stored associating each item with the item symbol and the graphics data representing the item (to be initially transmitted to the terminal 10).
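An entry of the transaction table 214 could, for instance, be modelled as below. The field names, prices (held as whole centimes to avoid rounding), item symbols and image file names are invented for the purpose of the sketch and do not come from the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TransactionEntry:
    item: str             # goods for sale
    unit: str             # unit of measure in which the item is sold
    price_centimes: int   # assumed price per unit
    item_symbol: str      # symbol carried in field 512 of the control message
    image_file: str       # graphics data sent to the terminal at start-up


TRANSACTIONS: Dict[str, TransactionEntry] = {
    "pommes": TransactionEntry("pommes", "kilo", 250, "APPLE", "apple.gif"),
    "beurre": TransactionEntry("beurre", "gramme", 1, "BUTTER", "butter.gif"),
}


def quote(item: str, quantity: int, unit: str) -> Optional[int]:
    """Return the total price, or None if the item or unit is not allowable."""
    entry = TRANSACTIONS.get(item)
    if entry is None or entry.unit != unit:
        return None
    return entry.price_centimes * quantity
```

A request naming an unknown item, or a unit in which the item is not sold, yields no price; in the scenario this would correspond to an incomplete or unrecognized transaction.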

A response table 216 consists of a plurality of entries 216 a, 216 b . . . each corresponding to one type of output control message 500 generated by the computer 20, and storing, for that output, the anticipated types of response, ranked in decreasing order of likelihood.

For example, the likely responses to the opening message “Vous désirez?” are, firstly, an attempt to buy produce; secondly, an attempt to enquire about produce (for example to ask the price).

On the other hand, the responses to the output “Et avec ca?” which follows a completed purchase include the above and additionally the possibility of the end of the session, in which case a statement indicating that nothing more is sought is expected.

Likewise, if the last response was to supply price information, the next response could be an attempt to complete a transaction for the subject of the inquiry, or could be a different enquiry, or an attempt to purchase something different, or an instruction to end the session.

Each entry in the response table also includes the associated source language response assistance text displayed in the text areas 314 a and 314 c.

Each of the possible responses in the response table 216 contains a pointer to an entry in a syntactic category table 218, indicating what syntactic category the response from the user is likely to fall into; for example, if the last output text displayed in the text area 306 asks “How many would you like?”, the answer could be a sentence including a verb (“I would like three kilos please”) or a noun phrase (“Three kilos”).

Finally, a buffer 220 of most recent system outputs is stored, storing the last, or the last few (e.g. two or three), system outputs as high level semantic structures. By reference to the system output buffer, it is therefore possible to determine to what the text input by the user is an attempt to respond and hence, using the response table 216, to assess the likeliest types of response, and (by reference to the syntactic categories table 218) the likely syntactic form in which the anticipated responses will be expressed.
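The interplay of the output buffer 220, the response table 216 and the syntactic category table 218 may be pictured with the following sketch. The keys "OPENING" and "ANYTHING_ELSE", the response type names and the category lists are hypothetical labels chosen for illustration; only the ranking by likelihood and the lookup via the last output are taken from the description.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# For each type of system output: anticipated response types, most likely first,
# each paired with the syntactic categories in which it is likely to be expressed.
RESPONSE_TABLE: Dict[str, List[Tuple[str, List[str]]]] = {
    "OPENING": [
        ("buy", ["sentence", "noun_phrase"]),
        ("enquire", ["sentence"]),
    ],
    "ANYTHING_ELSE": [
        ("buy", ["sentence", "noun_phrase"]),
        ("enquire", ["sentence"]),
        ("end_session", ["sentence", "noun_phrase"]),
    ],
}

# Buffer of the last few system outputs (here, at most three).
output_buffer: Deque[str] = deque(maxlen=3)


def expected_responses(buffer: Deque[str]) -> List[Tuple[str, List[str]]]:
    """Look up the ranked responses anticipated after the most recent output."""
    if not buffer:
        return []
    return RESPONSE_TABLE.get(buffer[-1], [])
```

Since the deque is bounded, appending a fourth output silently drops the oldest, so the buffer always reflects only the most recent exchanges.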

Operation of the Host Computer 20

Referring to FIG. 9, the operation of the host computer in this embodiment will now be described in greater detail.

Referring to FIG. 9 a, in step 602, an attempt by a terminal 10 to access the computer 20 is detected.

In step 604, the CPU 204 accesses the stored file within the store 206 storing the program to be downloaded and transmits the file (e.g. in the form of an Applet, for example in the Java (TM) programming language) to the terminal 10.

In step 606, the CPU 204 reads the transaction data table 214 and transmits, from each item record, the item image data file and the item type symbol.

The initial control message 500 sent in step 608 is predetermined, and consists of the data shown in FIG. 4 a (and described above in relation thereto) together with the stored text for display, if required, in the fields 314 a and 314 c which is stored in the response table 216 in the entry relating to this opening system output.

Referring to FIG. 9 b, in step 610, the host computer 20 awaits a text input from the terminal 10. On receipt, in step 611, if the language permits contractions such as “l'orange”, the contraction is expanded as a first step. Then, each word is compared with all the lexical entries in the table 208. Any word not present in these tables is assumed to be a mis-spelling which may correspond to one or more valid words; if a mis-spelling exists which could correspond to more than one valid word (step 614) then a node is created in the input text prior to the mis-spelling and each possible corresponding valid word is recorded as a new branch in the input text in place of the mis-spelled word (step 616).

If the word is not recognized even after spell correction (step 612) the word is retained and an indication of failure to recognize it is stored (step 613).

This process is repeated (step 620) until the end of the input text is reached (step 618).

If (step 622) any words were not recognized in step 612, it will be necessary to generate an output text indicating missing words and accordingly the process of the CPU 204 proceeds to FIG. 9 f (discussed below). Otherwise, at this stage, the input text consists entirely of words found in the table 208, several of which may appear in several alternative versions where a spelling error was detected, so as to define, in such cases, a stored lattice of words branching before each such mis-spelling into two or more alternative word paths.

The or each mis-spelling is stored prior to its replacement.
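The construction of the word lattice in steps 611 to 622 may be sketched as follows. The specification does not say how candidate corrections are found; the use of the Python standard library function difflib.get_close_matches, the cutoff of 0.8 and the small vocabulary are assumptions standing in for whatever spell corrector the embodiment employs.

```python
import difflib
from itertools import product
from typing import List, Tuple

# Assumed sample of the (expanded) lexical table 208.
VOCAB = ["trois", "cents", "grammes", "de", "beurre", "un", "kilo", "pommes"]


def build_lattice(text: str) -> Tuple[List[List[str]], List[str], List[str]]:
    """Return (lattice, stored mis-spellings, unrecognized words)."""
    lattice: List[List[str]] = []
    misspellings: List[str] = []
    unrecognized: List[str] = []
    for word in text.lower().split():
        if word in VOCAB:
            lattice.append([word])          # known word: a single branch
            continue
        candidates = difflib.get_close_matches(word, VOCAB, n=3, cutoff=0.8)
        if candidates:
            misspellings.append(word)       # keep the mis-spelling before replacing it
            lattice.append(candidates)      # one branch per possible valid word
        else:
            unrecognized.append(word)       # retained, with failure recorded
            lattice.append([word])
    return lattice, misspellings, unrecognized


def word_paths(lattice: List[List[str]]) -> List[str]:
    """Enumerate every alternative word path through the lattice."""
    return [" ".join(path) for path in product(*lattice)]
```

For the input of FIG. 4 b, “Trois cents grammes de beure”, the final position of the lattice holds the correction “beurre” and the mis-spelling “beure” is retained for later reporting.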

Referring to FIG. 9 c, next, in step 624, each word is looked up in theword store 208 and each possible syntactic category for each word (e.g.noun, verb) is read out, to create for each word a list of alternativeforms defining more branches in the lattice of words (step 626). Theprocess is repeated (step 630) until the end of the input text isreached (step 628).

At this point, the processor 204 selects a first path through the lattice of words thus created and reads each of the rules in the rule store 210 in turn, and compares the word path with each set of rules.

On each comparison, if the relationships between the properties of the words present correspond to the relationships specified in the rules, then the syntactic category associated with the rule in question is detected as being present, and a syntactic structure, corresponding to that syntactic category and the words which are detected as making it up, is stored.

The CPU 204 applies the correct form of each rule (e.g. 210 a) which specifies the necessary agreements between all words making up the syntactic category of the rule, and then in succession the relaxed forms of the same rule. When one of the forms of the rule is met, the syntactic category which is the subject of the rule is deemed to be present, and a successful parse is recorded.

However, the CPU 204 additionally stores information on any error encountered, by referring to the identity of the relaxed rule which successfully parsed the text; if the rule relaxes the gender agreement criterion, for example, a gender agreement error is recorded as being present between the words which were not in agreement.
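The application of a correct rule form followed by its relaxed forms may be sketched as follows. This is an illustrative Python sketch only (the embodiment itself suggests Prolog); the `Word` record, the rule table `NOUN_PHRASE_FORMS` and the function `parse_noun_phrase` are hypothetical names, and a single determiner-plus-noun rule is assumed for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    category: str  # e.g. "det" or "noun"
    gender: str    # "m" or "f"
    number: str    # "sg" or "pl"


# Hypothetical rule forms for a noun phrase: the strict form first, then
# relaxed forms, each of which drops exactly one agreement criterion.
NOUN_PHRASE_FORMS = [
    (None, ("gender", "number")),   # strict: both agreements required
    ("gender", ("number",)),        # gender agreement relaxed
    ("number", ("gender",)),        # number agreement relaxed
]


def parse_noun_phrase(det, noun):
    """Return a parse record, or None if no form of the rule is met."""
    if det.category != "det" or noun.category != "noun":
        return None
    for relaxed, required in NOUN_PHRASE_FORMS:
        if all(getattr(det, f) == getattr(noun, f) for f in required):
            errors = []
            if relaxed is not None:
                # The identity of the relaxed form names the error.
                errors.append((relaxed, det.text, noun.text))
            return {"category": "noun_phrase",
                    "words": (det.text, noun.text),
                    "errors": errors}
    return None


la = Word("la", "det", "f", "sg")
le = Word("le", "det", "m", "sg")
pomme = Word("pomme", "noun", "f", "sg")

print(parse_noun_phrase(la, pomme))  # strict form met, no errors
print(parse_noun_phrase(le, pomme))  # gender-relaxed form met
```

Because the strict form is tried first, an error is recorded only when the input satisfies a relaxed form but not the strict one.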

The parse may pass twice (or more times) through the input text, since some rules may accept as their input the syntactic structures generated in response to other rules (for example noun phrases and verb phrases).

If, after the parsing processing has concluded, it has been possible to parse the complete input text (step 636), the semantic structure thus derived is stored (step 636) and the next word path is selected (step 640) until all word paths through the word lattice have been parsed (step 641).

Next, in step 644, the CPU 204 reads the output response buffer 220, notes its previous output, and looks up the entry in the response table 214 associated with it. The response first read from the list is that considered most likely to correspond to the last output.

Next, the CPU 204 accesses, for that response, the corresponding entry in the syntactic category table 218 (again, the first entry selected corresponds to that most likely to be found).

Next, in step 646 the or each semantic structure derived above as a result of the parse of the input text is compared (steps 648-652) with the expected response syntactic category until a match is found.

The CPU 204 first reviews the parses performed by the strict forms of grammatical rules and, where a complete parse is stored based on the strict rules (i.e. with no errors recorded as being present), this is selected. Where no such parse exists, the CPU 204 then selects for comparison the or each parse including recorded errors, based on the relaxed forms of the rules.
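The preference for error-free parses, combined with the comparison against the expected response categories in order of likelihood, might be sketched as below. This is a Python sketch under stated assumptions: each stored parse is taken to be a dictionary with hypothetical `category` and `errors` fields, and `choose_parse` is an illustrative name rather than part of the described apparatus.

```python
def choose_parse(parses, expected_categories):
    """Pick the parse to act on.

    Parses recorded without errors (strict rule forms) are preferred;
    parses obtained through relaxed rule forms are considered only when
    no error-free parse exists. `expected_categories` is ordered with
    the most likely expected response first.
    """
    strict = [p for p in parses if not p["errors"]]
    candidates = strict if strict else parses
    for expected in expected_categories:
        for parse in candidates:
            if parse["category"] == expected:
                return parse
    return None


parses = [
    {"category": "question", "errors": ["gender"]},
    {"category": "noun_phrase", "errors": []},
]
chosen = choose_parse(parses, ["question", "noun_phrase"])
print(chosen)
```

In this sketch the error-free noun phrase is chosen even though a question was the more likely expected response, because a strict parse exists.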

At this point, in step 654, the CPU 204 ascertains whether the semantic structure contains an action which could be performed. For example, the semantic structure may correspond to:

a question which can be answered, or

a request for a sale transaction which can be met, or

an indication that a series of one or more sale transactions is now complete, in which case a price total can be calculated and indicated.

In the first of these cases, the input semantic structure needs to correspond to a question and needs to mention the type of item of which the price is being asked (in this embodiment price represents the only datum stored in relation to each transaction, but in general other properties could be questioned).

In the second case, the input statement needs to specify a kind of item to be sold and a quantity which is valid for that kind of goods (e.g. “apples” and “three kilos”). It may be phrased as a sentence in the target language (“I would like three kilos of apples”) or as a question (“Could I have three kilos of apples?”) or as a noun phrase (“Three kilos of apples”).

In the last case, the input text could take a number of forms, ranging from a word to a sentence.

If the input text does not obviously correspond to any action which could be carried out, further comparisons are attempted (the CPU 204 returns to step 652) and if no possible action is ultimately determined (or if one or more words are not recognized in step 612 above) then the CPU 204 determines that the input text cannot be understood (step 656).

If, on the other hand, all the information necessary to carry out an action (complete a purchase, answer a question etc.) is present then the CPU 204 selects that action for performance (step 658).

Finally, if it is possible to determine the nature of the action to be performed but not to perform it, then the CPU 204 formulates (step 660) a query to elucidate the missing information for the performance of the action.

For instance, if the input text is (in the target language) “I would like to buy some apples”, the CPU 204 determines that the intended action is to purchase apples, accesses the record for apples in the transaction table 214, and notes that the quantity information is missing.
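The three outcomes just described (perform the action, query for missing information, or report that the input was not understood) can be illustrated with the following Python sketch. The price table, the field names of the semantic structure and the function `decide_action` are all hypothetical and chosen only for the example.

```python
# Hypothetical transaction data: price per kilo for each kind of item.
PRICES = {"apples": 2.5, "pears": 3.0}


def decide_action(semantics):
    """Map a semantic structure to an action, a query, or a failure."""
    kind = semantics.get("type")
    item = semantics.get("item")
    if kind == "question" and item in PRICES:
        return {"action": "answer_price", "item": item,
                "price": PRICES[item]}
    if kind == "purchase" and item in PRICES:
        quantity = semantics.get("quantity")
        if quantity is None:
            # The action is known but cannot yet be performed.
            return {"action": "query", "missing": "quantity", "item": item}
        return {"action": "sell", "item": item, "quantity": quantity,
                "cost": quantity * PRICES[item]}
    if kind == "finished":
        return {"action": "total"}
    return {"action": "not_understood"}


# "I would like to buy some apples": item known, quantity missing.
print(decide_action({"type": "purchase", "item": "apples"}))
# "I would like three kilos of apples": all information present.
print(decide_action({"type": "purchase", "item": "apples", "quantity": 3}))
```

The query result carries the name of the missing datum, so that a question requesting the quantity of apples can subsequently be generated.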

In each case, the CPU 204 is arranged to derive output text, user guidance text and an indication of suitable images for display, for transmission to the terminal 10.

Where unrecognized words have caused the input text not to be understood, the CPU 204 generates user guidance text (step 666) indicating to the user the words which have not been understood and prompting the user for replacements. In step 668, output text (in the target language) is generated indicating that the grocer cannot understand the words concerned.

The same process is performed where (step 656) the input text was not understood for other reasons, except that the output text and user guidance texts refer to general misunderstanding rather than specific words.

Error Present

In the event that an action has been fully or partly possible, the semantic structure corresponding to the action to be undertaken (for example indicating that three kilograms of apples are to be sold, or that a question is to be asked requesting the quantity of apples) is stored in the output buffer 220.

In the event that an action has been fully or partly possible, then in step 662 the CPU 204 determines whether spelling or grammatical errors were entered. If so, then in step 664, the CPU 204 selects comprehension text consisting of one or both of the pre-stored target language phrases “Erreur d'orthographe!” or “Erreur de grammaire!” for transmission in the comprehension text field 508 and display in the comprehension text area 308.

At the same time, the CPU generates source language help text for transmission in the user guidance text field 514 and display in the user guidance area 314 b. Where the error is a spelling mistake, the text comprises, in the source language, the words “What the tutor thinks you did wrong is . . . I think you made a spelling mistake, (stored input word) should be (word with which it was replaced in the successful parse)”.

Where the error is a grammatical error, the CPU determines which rule failed to be met, and thereby determines whether the error was an error of gender or number, or an error of subject/verb agreement.

The text then generated is “What the tutor thinks you did wrong is . . . I think you made a grammatical mistake, try checking you have used the right (gender, number or verb form)”.
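A minimal Python sketch of how such help text might be assembled from a recorded error is given below. The wording of the templates follows the description above; the structure of the error record and the function name `guidance_text` are assumptions made for illustration.

```python
PREFIX = "What the tutor thinks you did wrong is . . . "


def guidance_text(error):
    """Build source-language help text from a recorded error.

    `error` is a hypothetical record: spelling errors carry the stored
    input word and its replacement; grammatical errors carry the relaxed
    criterion ("gender", "number" or "verb form").
    """
    if error["kind"] == "spelling":
        return (PREFIX + "I think you made a spelling mistake, "
                + error["typed"] + " should be " + error["replacement"])
    if error["kind"] == "grammar":
        return (PREFIX + "I think you made a grammatical mistake, "
                "try checking you have used the right "
                + error["criterion"])
    raise ValueError("unknown error kind: " + repr(error["kind"]))


print(guidance_text({"kind": "spelling", "typed": "pomne",
                     "replacement": "pomme"}))
print(guidance_text({"kind": "grammar", "criterion": "gender"}))
```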

Next, in step 666 the CPU 204 selects the text to be output for the user guidance text areas 314 a and 314 c. The text for the area 314 a is obtained by looking up the stored last output in the buffer 220 and accessing the text stored in the corresponding record 216 for that output. This text describes the response selected in step 658 or the query formulated in step 660; for example, where the action of supply of goods has been successfully completed (step 658) the text in field 314 a will read (in the source language) “What the shop keeper has just said is . . . The shop keeper has supplied your goods, and is waiting for you to give him a new instruction.”

The text in the field 314 c offers the user logical response options, and is obtained by looking up the text stored with the anticipated responses in the field within the table 216 which relates to the action or query just generated in step 658 or 660 and stored in the buffer 220.

Finally, in step 668, the output text field 506 to be sent in the message 500 and displayed in the output text area 306 is generated.

The generation could take the form of simple selections of corresponding text, as in the above described text generation stages, but it is preferred in this embodiment to generate the output text in a freer format, since this is likely to lead to greater variability of the responses experienced by the user and lower memory requirements.

To achieve this, the CPU 204 utilizes the rules stored in the rule table 210 and the words stored in the lexicon 208 to generate text from the high level response generated in steps 658 or 660. In general, the process is the reverse of the parsing process described above, but simpler since the process starts from a known and deterministic semantic structure rather than an unknown string of text.

The first stage, as shown in FIG. 9 f, is to select from the lexicon table 208 a subset of words which could be used in the output text. In a step 6681, the CPU 204 reviews the first term in the semantic structure generated in step 658 or 660. In a step 6682, the CPU 204 looks up, in the lexical table 208, each word the record of which begins with that term.

In step 6683, the CPU 204 compares the record for the word with the output semantic structure. If all other terms required by the word are present in the output semantic structure, then in step 6684 the word is stored for possible use in text generation; if not, the next word beginning with that term is selected (step 6685).

When the last word is reached (step 6686), the next term is selected (step 6687) and the process is repeated until the last term is reached (step 6688), at which point all words which could contribute to the generation of the output text have been stored.
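The word selection loop of steps 6681 to 6688 may be sketched in Python as follows. The lexicon contents, the representation of each lexical record as an ordered list of terms, and the function name `select_words` are assumptions made purely for illustration.

```python
# Hypothetical lexicon: each record lists the semantic terms the word
# expresses; the first term is the one under which it is looked up.
LEXICON = {
    "pommes": ["apples"],
    "combien": ["query", "quantity"],
    "quel prix": ["query", "price"],
    "kilos": ["quantity", "kilo"],
}


def select_words(semantic_terms, lexicon=LEXICON):
    """Collect the words that could contribute to the output text."""
    present = set(semantic_terms)
    selected = []
    for term in semantic_terms:                  # steps 6681, 6687, 6688
        for word, record in lexicon.items():     # steps 6682, 6685, 6686
            if record[0] != term:
                continue
            # Step 6683: every other term the word requires must also
            # be present in the output semantic structure.
            if all(t in present for t in record[1:]):
                if word not in selected:
                    selected.append(word)        # step 6684
    return selected


# Semantic structure for a query about the quantity of apples.
print(select_words(["query", "quantity", "apples"]))
```

In this sketch “quel prix” is rejected because the structure contains no price term, and “kilos” is rejected because no kilo term is present, leaving only the query word for quantity and the word for apples.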

Next, in step 6689, the CPU 204 accesses the rules table 210 and applies the rules relating to the stored terms of the output semantic structure to the words selected in the preceding steps to generate output text.

Thus, where the quantity of apples required is to be queried, the semantic structure includes a term specifying a query; a term specifying that the subject of the query is quantity; and a term specifying that the object of the query is that which an attempt was previously made to purchase; namely apples.

The words selected in steps 6681-6688 consist of the word for “apples” in the target language; and the query word or phrase which specifies quantity. Application of the rules for construction of a query then leads to the generation of a grammatically correctly worded question.

Returning to FIG. 9 d, in step 670 the CPU 204 transmits the control message 500 formed by the above steps to the terminal 10. The CPU 204 then returns to step 610 of FIG. 9 b to await the next received input text.

Other Embodiments and Modifications

In the foregoing, for clarity, the operations of the embodiment have been described in general terms, without specifying in detail the steps which are performed by separate programme components. In a convenient implementation, however, the applet program would control all image displaying operations, and image data would be supplied by the server program on the host computer 20, rather than by the application program performing the semantic processing.

In the foregoing embodiments, conveniently, the semantic processing performed on the host processor 20 may be written in the Prolog language, and the parsing may be performed by Prolog backtracking.

It will, however, be recognized that the invention could be implemented using any convenient hardware and/or software techniques other than those described above.

Equally, while a language training program has been described, it will be recognized that the invention is applicable to other types of training in which it is desired to emulate the interaction of a user with another person.

Further, it will be apparent that the terminal 10 and computer 20 could be located in different jurisdictions, or that parts of the invention could further be separated into different jurisdictions connected by appropriate communication means. Accordingly, the present invention extends to any and all inventive subcomponents and subcombinations of the above described embodiments located within the jurisdiction hereof.

In the above described embodiments, text input and output have been described. However, in a further embodiment, the terminal 10 may be arranged to accept input speech via a microphone and transmit the speech as a sound file to the computer 20, which is correspondingly arranged to apply a speech recognition algorithm to determine the words present in the input.

Together, or separately, the output text generated by the grocer may be output as synthesised speech, and accordingly in this embodiment the computer 20 comprises a text to speech synthesizer arranged to generate a sound file transmitted to the terminal 10. In either such case, a suitable browser program other than the above described Netscape (TM) browser is employed.

Other forms of input and output (for example, handwriting recognition input) could equally be used.

Although in the preceding embodiments the redisplay of the head portion of the grocer image has been described, it will be apparent that it may be more convenient simply to redisplay the entire image of the grocer in other embodiments.

It will be apparent that the transactions described above need not be those of a grocer shop. The scenario could, for example, involve a clothes shop (in which case the articles sold would comprise items of clothing) or a butcher's shop (in which case the items sold would comprise cuts of meat). Equally, other forms of training than foreign language training could be involved, in which case the scenarios could involve familiarity in the source language with scenarios such as emergency or military procedures.

Accordingly, the invention is not limited by the above described embodiments but extends to any and all such modifications and alternatives which are apparent to the skilled reader hereof.

1. Training apparatus for training a user to engage in transactions with another person whom the apparatus is arranged to simulate, the apparatus comprising: an output device for outputting of messages to a user; an input for receiving input from the user; a lexical store containing data relating to individual words of said input; a rule store containing rules specifying grammatically allowable relationships between words of said input; a transaction store containing data relating to allowable transactions between said user and said person, said data defining, for said output messages, types of allowable inputs from said user; an output message buffer for storing data representative of the most recent message output by the output device and at least a preceding one of said messages output from the output device; a processor having at least read access to the lexical store and the rule store, said processor being arranged to process the input by comparing the input with the words contained in said lexical store and with the relationships specified by the rules contained in said rule store, in order to recognize the occurrence in the input of words contained in said lexical store and in the relationships specified by the rules contained in said rule store, and, in dependence upon said recognition, to generate output indicating when correct input has been recognized; and wherein said processor is further responsive to the data contained in the message buffer and the transaction store to; (a) determine whether said input is an allowable response to a most recent one of the output messages represented by data stored in the output message buffer; and (b) if said input is not determined to be an allowable response to a most recent one of the messages, determine whether said input is an allowable response to a preceding message represented by data stored in the output message buffer; an output device for making the output available to the user so that said user can be trained to engage in transactions with another person.
2. Apparatus according to claim 1, in which the processor is arranged to generate output responsive to input, and to detect recognized errors in said input, and, on detection thereof, to indicate said recognized errors separately of said responsive output.
3. Apparatus according to claim 1 which is arranged to provide language training, in which said rules, said words, and said output are in a training target language, and further arranged to generate user guidance in a source language for said user and different to said target language.
4. Apparatus according to claim 3 in which the user guidance comprises guidance as to the meaning of the output.
5. Apparatus according to claim 3 in which the user guidance comprises an explanation of any detected errors in the input.
6. Apparatus according to claim 3 in which the user guidance indicates suitable further input which could be provided.
7. Apparatus according to claim 3, in which said user interface comprises a display to display said output and user guidance is normally not displayed on said display, and further comprising an input device via which a user may selectively cause the display of said user guidance on said display.
8. Apparatus according to claim 1 in which said input and/or said output comprise text.
9. Apparatus according to claim 1, in which said input comprises speech, and further comprising a speech recognizer arranged to recognize the words of said speech.
10. Apparatus according to claim 1 in which said output comprises speech, said apparatus further comprising a speech synthesizer.
11. Apparatus according to claim 1, further comprising a user interface arranged to accept said input and make available said output to the user.
12. Apparatus according to claim 11, in which said user interface comprises a display and in which said output is displayed on said display.
13. Apparatus according to claim 11, in which said user interface is located remotely from said processor and is coupled thereto via a communications channel.
14. Apparatus according to claim 1, further comprising an inflection store operatively coupled to said lexical store.
15. Apparatus according to claim 14, wherein each record in said lexical store contains a pointer to one of records in said inflection store.
16. Apparatus according to claim 14, wherein the number of records in the inflection store is smaller than the number of records in the lexical store.
17. Training apparatus for training a user to engage in transactions with another person whom the apparatus is arranged to simulate, the apparatus comprising: an input for receiving input dialogue from a user; a lexical store containing data relating to individual words of said input dialogue; a rule store containing rules specifying grammatically allowable relationships between words of said input dialogue; a transaction store containing data relating to allowable transactions between said user and said person; a processor having at least read access to the lexical store, the rule store and the transaction store, said processor being arranged to process the input dialogue by comparing the input dialogue with the words contained in said lexical store, with the relationships specified by the rules contained in said rule store, and with the data specified in the transaction store, in order to recognize the occurrence in the input dialogue of words contained in said lexical store, in the relationships specified by the rules contained in said rule store, in accordance with the data specified in the transaction store, and, in dependence upon said recognition, to generate output dialogue indicating when correct input dialogue has been recognized; and an output device for making the output dialogue available to the user so that said user can be trained to engage in transactions with another person; wherein said rule store contains first rules comprising criteria specifying correct relationships between words of said lexical store, and, associated with said first rules, one or more second rules each corresponding to one of said first rules but with one relationship criterion relaxed, said processor processing said input dialogue using both said first rules and second rules.
18. Apparatus according to claim 17, wherein said relationship criteria correspond to agreements between words.
19. Apparatus according to claim 18, wherein said agreements between words comprises agreements of gender or agreements of number.
20. Apparatus according to claim 17, in which said processor is arranged to detect said recognized errors on detection of input dialogue containing words which meet said second, but not said first, rules.
21. An interactive dialogue apparatus for simulating dialogue with a user, the apparatus comprising: an output device for outputting messages to the user; an input device for receiving input from the user in response to a message output from the output device in order to simulate dialogue; a lexical store for storing data relating to individual words; a rule store for storing rules specifying grammatically allowable relationships between words of said input; a processor for processing said input to recognize occurrence in the input of words stored in said lexical store and in the relationships specified by the rules stored in said rule store; an output message buffer for storing data representative of a plurality of messages output to said user; and a transaction store for storing data defining, for each of said messages, a type of allowable response; said processor being responsive to an input from said user, to the data stored in the output message buffer and to the data stored in the transaction store to; (a) determine whether said input is an allowable response to a most recent one of the messages represented by data stored in the output message buffer; and (b) if said input is determined not to be an allowable response to a most recent one of the messages, determine whether said input is an allowable response to another one of the messages represented by data stored in the output message buffer.
22. Apparatus according to claim 21, wherein the processor is arranged to generate output responsive to input, and to detect recognized errors in said input, and, on detection thereof, to indicate said recognized errors separately of said responsive output.
23. Apparatus according to claim 21, said apparatus being arranged to provide language training, in which said rules, said words, and said output are in a training target language, and further being arranged to generate user guidance in a source language for said user and different to said target language.
24. An interactive dialogue apparatus for simulating dialogue with a user, the apparatus comprising: an output device for outputting messages to the user; an input device for receiving input from the user; a lexical store for storing data relating to individual words; a rule store for storing rules specifying grammatically allowable relationships between words of said input; a processor for processing said input to recognize occurrence in the input of words stored in said lexical store and in the relationships specified by the rules stored in said rule store; an output message buffer for storing data representative of a plurality of messages output to said user; and a transaction store for storing data defining, for each of said messages, a type of allowable response; said processor being responsive to an input from said user, to the data stored in the output message buffer and to the data stored in the transaction store to; (a) determine whether said input is an allowable response to a most recent one of the messages represented by data stored in the output message buffer; and (b) if said input is determined not to be an allowable response to a most recent one of the messages, determine whether said input is an allowable response to another one of the messages represented by data stored in the output message buffer; wherein said rule store stores first rules comprising criteria specifying correct relationships between words of said lexical store, and, associated with said first rules, one or more second rules each corresponding to one of said first rules but with one relationship criterion relaxed, said processor processing said input using both said first rules and second rules.
25. A method of operating an interactive dialogue apparatus for simulating dialogue with a user, the method comprising: outputting messages to the user; receiving input from the user in response to a message output to the user in order to simulate dialogue; storing data relating to individual words; storing rules specifying grammatically allowable relationships between words of the input; processing said input to recognize occurrence in the input of words related to stored data and relationships specified by the stored rules; storing message data representative of a plurality of messages output to the user; and storing data defining, for each of the output messages, a type of allowable response; determining whether an input is an allowable response to a most recent one of the stored messages represented by stored message data; and if the input is not determined to be an allowable response to a most recent one of the messages, determining whether the input is an allowable response to another stored message represented by stored message data.
26. An interactive dialogue apparatus for simulating dialogue with a user, the apparatus comprising: an output device for outputting messages to the user; an input device for receiving input from the user in response to a message output from the output device in order to simulate dialogue; a processor for processing said input; an output message buffer for storing a plurality of messages output to the user; and a transaction store for storing a type of allowable response for each of the messages output to the user; wherein the processor determines whether the input is an allowable response to a most recent one of the messages output to the user, and if not, determining whether the input is an allowable response to a previous one of the messages output to the user.
27. Training apparatus for training a user to engage in transactions with another person whom the apparatus is arranged to simulate, the apparatus comprising: an output device for outputting of messages to a user; an input for receiving input from the user; a lexical store containing data relating to individual words of said input; a rule store containing rules specifying grammatically allowable relationships between words of said input; a transaction store containing data relating to allowable transactions between said user and said person, said data defining, for said output messages, types of allowable inputs from said user; an output message buffer for storing data representative of the most recent message output by the output device and at least a preceding one of said messages output from the output device; a processor having at least read access to the lexical store and the rule store, said processor being arranged to process the input by comparing the input with the words contained in said lexical store and with the relationships specified by the rules contained in said rule store, in order to recognize the occurrence in the input of words contained in said lexical store and in the relationships specified by the rules contained in said rule store, and, in dependence upon said recognition, to generate output indicating when correct input has been recognized; and wherein said processor is further responsive to the data contained in the message buffer and the transaction store to; (a) determine whether said input is an allowable response to a most recent one of the output messages represented by data stored in the output message buffer; and (b) if said input is not determined to be an allowable response to a most recent one of the messages, determine whether said input is an allowable response to a preceding message represented by data stored in the output message buffer; and an output device for making the output available to the user so that said user can be trained to engage in transactions with another person; wherein said rule store stores first rules comprising criteria specifying correct relationships between words of said lexical store, and, associated with said first rules, one or more second rules each corresponding to one of said first rules but with one relationship criterion relaxed, said processor processing said input using both said first rules and second rules.
28. A dialogue training apparatus for training a user to engage in dialogue transactions with another person whom the apparatus is arranged to simulate, the apparatus comprising: an output device for outputting of messages to a user; an input for receiving input from the user; a lexical store containing data relating to individual words of said input; a rule store containing rules specifying grammatically allowable relationships between words of said input; a transaction store containing data relating to allowable transactions between said user and said person, said data defining, for said output messages, types of allowable inputs from said user; an output message buffer for storing data representative of the most recent message output by the output device and at least a preceding one of said messages output from the output device; a processor having at least read access to the lexical store and the rule store, said processor being arranged to process the input by comparing the input with the words contained in said lexical store and with the relationships specified by the rules contained in said rule store, in order to recognize the occurrence in the input of words contained in said lexical store and in the relationships specified by the rules contained in said rule store, and, in dependence upon said recognition, to generate output indicating when correct input has been recognized; and wherein said processor is further responsive to the data contained in the message buffer and the transaction store to; (a) determine whether said input is an allowable response to a most recent one of the output messages represented by data stored in the output message buffer; and (b) if said input is not determined to be an allowable response to a most recent one of the messages, determine whether said input is an allowable response to a preceding message represented by data stored in the output message buffer; an output device for making the output available to the user for training the user to engage in dialogue transactions with another person.